CasKAS: direct profiling of genome-wide dCas9 and Cas9 specificity using ssDNA mapping

Detecting and mitigating off-target activity is critical to the practical application of CRISPR-mediated genome and epigenome editing. While numerous methods have been developed to map Cas9 binding specificity genome-wide, they are generally time-consuming and/or expensive, and not applicable to catalytically dead CRISPR enzymes. We have developed CasKAS, a rapid, inexpensive, and facile assay for identifying off-target CRISPR enzyme binding and cleavage by chemically mapping the unwound single-stranded DNA structures formed upon binding of a sgRNA-loaded Cas9 protein. We demonstrate this method in both in vitro and in vivo contexts.

Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-023-02930-z.


Supplementary Figure 16: CasKAS signal in vivo (HEK293 cells, harvested at 48 hours) around the CD298/ATP1B3 gene with two different sgRNAs targeting the gene. Note that in this case the gene displays strong native KAS-seq signal around its promoter in HEK293 cells, overlapping the sgRNA target sites (see the profiles for "safe" sgRNAs not targeting this locus below). With dCas9, KAS-seq signal above control levels is observed, but not with active Cas9, suggesting that the processes generating ssDNA at this locus (e.g. association with RNA polymerases) might be displacing the active Cas9; in contrast, continuous reassociation of dCas9 (as the target sequence is not altered by cleavage) maintains the elevated KAS-seq signal.

Supplementary Figure 17: CasKAS identifies genuine off-target sites that are missed by sgRNA prediction algorithms. Shown is in vitro dCas9 CasKAS for the "sgRNA #1" sgRNA. Peaks were called de novo using MACS2, then intersected with Cas-OFFinder off-target predictions, and the peaks outside this intersection (the outersect) were manually filtered to exclude obvious artifacts based on peak shape (e.g. arising from repetitive elements in the genome). (a) Aggregate forward- and reverse-strand profiles around off-target sites predicted by Cas-OFFinder (centered on the sgRNA); (b) Aggregate forward- and reverse-strand profiles around sites not predicted by Cas-OFFinder (centered on the MACS2 peak summit); (c) Example UCSC Genome Browser snapshot of a CasKAS read profile around an off-target site predicted by Cas-OFFinder; (d) Example UCSC Genome Browser snapshot of a CasKAS read profile around an off-target site not predicted by Cas-OFFinder. Both the predicted sites and the sites identified only through peak calling exhibit the expected asymmetric read distribution around a fixed occupancy point (the sgRNA-dCas9 RNP complexed with DNA).
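The intersect/outersect step described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the BED-style half-open coordinates, and the toy intervals are all assumptions made for illustration, and the manual peak-shape filtering is not modeled.

```python
from collections import defaultdict

def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end

def split_peaks(peaks, predicted):
    """Partition peaks into (intersect, outersect) relative to predicted sites.

    Both inputs are lists of (chrom, start, end) tuples in BED-style
    half-open coordinates (an assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in predicted:
        by_chrom[chrom].append((start, end))

    intersect, outersect = [], []
    for chrom, start, end in peaks:
        hit = any(overlaps(start, end, s, e) for s, e in by_chrom[chrom])
        (intersect if hit else outersect).append((chrom, start, end))
    return intersect, outersect

# Hypothetical toy data: three called peaks and two predicted off-target sites.
peaks = [("chr1", 100, 300), ("chr1", 5000, 5200), ("chr2", 50, 250)]
predicted = [("chr1", 250, 273), ("chr2", 1000, 1023)]
both, novel = split_peaks(peaks, predicted)
```

Here `both` holds peaks that coincide with a predicted site, and `novel` holds the outersect that would go on to manual filtering. The linear scan per chromosome is adequate for a sketch; a sorted or tree-based index would scale better on genome-wide peak sets.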

Supplementary Figure 18: In vitro dCas9 and Cas9 CasKAS profiles for the "Nanog-sg2" sgRNA. CasKAS profiles are shown for all off-target sites predicted by Cas-OFFinder as well as for the top 1000 sites (ranked by CasKAS RPM values over the ±500 bp region around the sgRNA target site).
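As a rough illustration of the ranking described in this caption, the sketch below scores each candidate site by reads per million (RPM) in a ±500 bp window and keeps the highest-scoring sites. The function names and the representation of reads as single positions per chromosome are assumptions of this sketch, not details taken from the paper.

```python
import heapq

def window_rpm(read_positions, center, total_reads, flank=500):
    """Reads per million whose position falls within [center - flank, center + flank]."""
    in_window = sum(1 for p in read_positions if center - flank <= p <= center + flank)
    return in_window * 1_000_000 / total_reads

def top_sites(sites, reads_by_chrom, total_reads, n=1000, flank=500):
    """Return the n highest-scoring sites as (rpm, chrom, center) tuples, best first."""
    scored = [
        (window_rpm(reads_by_chrom.get(chrom, []), center, total_reads, flank), chrom, center)
        for chrom, center in sites
    ]
    return heapq.nlargest(n, scored)

# Hypothetical toy data: read positions per chromosome and a nominal library size.
reads_by_chrom = {"chr1": [100, 120, 900, 5000], "chr2": [40]}
sites = [("chr1", 300), ("chr1", 5000), ("chr2", 0)]
ranked = top_sites(sites, reads_by_chrom, total_reads=1_000_000, n=2)
```

Normalizing by the total number of mapped reads is what makes scores comparable between libraries of different depth, which matters when dCas9 and Cas9 profiles are shown side by side.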

Supplementary Figure 19: In vitro dCas9 and Cas9 CasKAS profiles for the "Nanog-sg3" sgRNA. CasKAS profiles are shown for all off-target sites predicted by Cas-OFFinder as well as for the top 1000 sites (ranked by CasKAS RPM values over the ±500 bp region around the sgRNA target site).
Supplementary Figure 42: Multiple sequence alignment of off-target sites identified by in vitro dCas9 and Cas9 CasKAS for the "sgRNA #1" sgRNA outside the list of off-targets predicted by Cas-OFFinder. MACS2 peak calls were manually filtered to exclude artifactual peaks, then the sequence of the ±50-bp region around each peak summit was used as input to the multiple sequence alignment, together with the sgRNA itself.

Supplementary Figure 52: Amplicon sequencing of DNA edits with the EMX1 sgRNA. HEK293 cells were transfected (in replicates) with the EMX1 or the VEGFA sgRNAs. Genomic DNA was extracted and a total of 81 potential off-target sites for the EMX1 sgRNA were amplicon-sequenced. (a) The indel frequency profiles (the fraction of reads with an indel over a given position) over each such site identify the on-target site as the only position that is edited, concordant with CasKAS results. (b) Genome browser snapshot of indel frequencies over the on-target site. (c) An additional site shows high indel frequency; however, the indel is present at the same rate in all datasets, including the no-guide negative control, indicating that it is an endogenous sequence variant and not an actual in vivo off-target.
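The per-position indel frequency defined in this caption (the fraction of reads with an indel over a given position) can be sketched as follows. The read representation is an assumption of this sketch: each read is a tuple of aligned start, aligned end, and a list of indel intervals in half-open reference coordinates, with an insertion recorded as a one-base interval at its anchor position.

```python
def indel_frequency(reads, amplicon_length):
    """Per-position fraction of covering reads that carry an indel at that position.

    reads: iterable of (start, end, indels), where indels is a list of
    (indel_start, indel_end) half-open reference intervals.
    """
    covered = [0] * amplicon_length
    with_indel = [0] * amplicon_length
    for start, end, indels in reads:
        for pos in range(max(start, 0), min(end, amplicon_length)):
            covered[pos] += 1
        # Collect affected positions once per read so that overlapping
        # indel intervals in the same read are not counted twice.
        hit = set()
        for i_start, i_end in indels:
            hit.update(range(max(i_start, 0), min(i_end, amplicon_length)))
        for pos in hit:
            with_indel[pos] += 1
    return [w / c if c else 0.0 for w, c in zip(with_indel, covered)]

# Hypothetical toy data: four reads spanning a 10-bp amplicon, two with deletions.
reads = [(0, 10, [(4, 6)]), (0, 10, []), (0, 10, [(5, 6)]), (0, 10, [])]
freq = indel_frequency(reads, amplicon_length=10)
```

Running the same calculation on a no-guide control and comparing profiles position by position is one way to recognize a site like the one in panel (c), where an apparent edit appears at the same rate with and without the sgRNA.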
Supplementary Figure 53: Amplicon sequencing of DNA edits with the VEGFA sgRNA. HEK293 cells were transfected (in replicates) with the EMX1 or the VEGFA sgRNAs. Genomic DNA was extracted and a total of 52 potential off-target sites for the VEGFA sgRNA were amplicon-sequenced. Very high in vivo indel frequency is observed for the on-target site (b) and for one of the other sites (d) identified as active Cas9 cutting targets by in vitro CasKAS. Another site (c) also shows elevated indel frequency. The fourth site (e) does not appear to be a cutting target in vitro.
Supplementary Figure 54: Comparing in vitro dCas9 results with ChIP-seq and CHANGE-seq for off-target profiling. Shown is the overlap between MACS2 peak calls for the Nanog-sg3 sgRNA and the Nanog ChIP-seq dataset (SRR1168384 from GEO accession ID GSE54745) in (a), and for the EMX1 sgRNA and EMX1 CHANGE-seq (SRA accession SRX8227890) in (b). The fraction of peaks common or unique to each assay that are predicted to be off-targets for each sgRNA by Cas-OFFinder is shown in (c).

Supplementary Figure 57: Comparing in vitro dCas9 results with GUIDE-seq for off-target profiling. Shown is the overlap between MACS2 peak calls for the EMX1 and VEGFA sgRNAs and the off-target regions defined by the original GUIDE-seq publication (ref. 4).
Supplementary Figure 58: Comparing in vitro dCas9 results with Digenome-seq for off-target profiling. Shown is the overlap between MACS2 peak calls for the VEGFA sgRNA and the off-target regions defined by the original Digenome-seq publication (ref. 2).